52 research outputs found

    Phylogenomics and Genome Annotation


    Repository/R-Forge/DateTimeStamp 2012-12-11 16:03:18

    Suggests: ade4, segmented. Description: Exploratory data analysis and data visualization for biological sequence (DNA and protein) data. Also includes utilities for sequence data management under the ACNUC system. License: GPL (>= 2

    Assessing Recent Selection and Functionality at Long Non-Coding RNA Loci in the Mouse Genome

    This work was supported by the Biotechnology and Biological Sciences Research Council and The Wellcome Trust. A.N. was supported by the Swiss National Science Foundation (Grant: PZ00P3_142636). H.K. was supported by the European Research Council (Starting Grant: 242597, SexGenTransEvolution) and the Swiss National Science Foundation (Grants: 130287 and 146474). Long noncoding RNAs (lncRNAs) are one of the most intensively studied groups of noncoding elements. Debate continues over what proportion of lncRNAs are functional and what proportion merely represent transcriptional noise. Although characterization of individual lncRNAs has identified approximately 200 functional loci across the Eukarya, general surveys have found only modest or no evidence of long-term evolutionary conservation. Although this lack of conservation suggests that most lncRNAs are nonfunctional, the possibility remains that some represent recent evolutionary innovations. We examine recent selection pressures acting on lncRNAs in mouse populations. We compare patterns of within-species nucleotide variation at approximately 10,000 lncRNA loci in a cohort of the wild house mouse, Mus musculus castaneus, with between-species nucleotide divergence from the rat (Rattus norvegicus). Loci under selective constraint are expected to show reduced nucleotide diversity and divergence. We find limited evidence of sequence conservation compared with putatively neutrally evolving ancestral repeats (ARs). Comparisons of sequence diversity and divergence between ARs, protein-coding (PC) exons and lncRNAs, and the associated flanking regions, show weak but significantly lower levels of sequence diversity and divergence at lncRNAs compared with ARs. lncRNAs conserved deep in the vertebrate phylogeny show lower within-species sequence diversity than lncRNAs in general. A set of 74 functionally characterized lncRNAs shows levels of diversity and divergence comparable to PC exons, suggesting that these lncRNAs are under substantial selective constraints. Our results suggest that, in mouse populations, most lncRNA loci evolve at rates similar to ARs, whereas older lncRNAs tend to show signals of selection similar to PC genes.

    The fitness cost of mis-splicing is the main determinant of alternative splicing patterns

    Background: Most eukaryotic genes are subject to alternative splicing (AS), which may contribute to the production of protein variants or to the regulation of gene expression via nonsense-mediated messenger RNA (mRNA) decay (NMD). However, a fraction of splice variants might correspond to spurious transcripts and the question of the relative proportion of splicing errors to functional splice variants remains highly debated. Results: We propose a test to quantify the fraction of AS events corresponding to errors. This test is based on the fact that the fitness cost of splicing errors increases with the number of introns in a gene and with expression level. We analyzed the transcriptome of the intron-rich eukaryote Paramecium tetraurelia. We show that in both normal and in NMD-deficient cells, AS rates strongly decrease with increasing expression level and with increasing number of introns. This relationship is observed for AS events that are detectable by NMD as well as for those that are not, which invalidates the hypothesis of a link with the regulation of gene expression. Our results show that in genes with a median expression level, 92–98% of observed splice variants correspond to errors. We observed the same patterns in human transcriptomes and we further show that AS rates correlate with the fitness cost of splicing errors. Conclusions: These observations indicate that genes under weaker selective pressure accumulate more maladaptive substitutions and are more prone to splicing errors. Thus, to a large extent, patterns of gene expression variants simply reflect the balance between selection, mutation, and drift.

    Etude des patrons d'évolution asymétrique dans les séquences d'ADN (Study of asymmetric evolution patterns in DNA sequences)

    This thesis analyses the effect of two essential cellular mechanisms, replication and transcription, on the base composition of DNA sequences. These two processes act asymmetrically on the two DNA strands, and as a consequence they produce an asymmetric nucleotide composition in genomic sequences. Moreover, they act in a coordinated manner on genomes, so estimating their respective effects can be difficult. First, we studied the co-orientation between replication and transcription in prokaryotes. We proposed a method for the study of base composition biases which can separate these two sources of asymmetry. We show that the effect of replication on base composition can be highly variable even between closely related species. We then studied the substitution pattern in transcribed regions and around replication origins in the human genome, with special emphasis on the effect of the 5'-3' nucleotide context. Patterns of context-dependent asymmetric substitutions are similar for replication and transcription. The variation of substitution rates with the expression pattern of genes suggests the presence of context-dependent asymmetric repair mechanisms. Finally, we proposed a computational approach for the study of the substitution pattern in sequences with biased composition, namely microsatellites. We show that transcribed microsatellites are subject to the same asymmetric processes as non-repeated regions.